176 research outputs found

    Sparse Support Matrix Machines for the Classification of Corrupted Data

    University of Technology Sydney. Faculty of Engineering and Information Technology. The support matrix machine is fragile in the presence of outliers: even a few corrupted data points can arbitrarily alter the quality of the approximation. What if a fraction of the columns are corrupted? In the real world, data is noisy and most of the features may be redundant or useless, which in turn degrades classification performance. It is therefore important to perform robust feature selection under robust metric learning, filtering out redundant features and ignoring noisy data points for more interpretable modelling. To overcome this challenge, we propose a new model that addresses the classification of high-dimensional data by jointly optimizing the regularizer and the hinge loss. We combine the hinge loss with a regularization term, the spectral elastic net penalty, which promotes structural sparsity and shares similar sparsity patterns across multiple predictors. It is a spectral extension of the conventional elastic net that combines the low-rank and joint-sparsity properties to deal with complex, high-dimensional, noisy data. We further extend this approach by combining recovery with feature selection and classification, which can significantly improve performance under the assumption that the data consists of a low-rank clean matrix plus a sparse noise matrix. We perform matrix recovery, feature selection and classification through joint minimization of the p,q-norm and the nuclear norm under the incoherence and ambiguity conditions, and are able to recover intrinsic matrices of higher rank and data with much denser corruption. Although both of these methods take full advantage of the low-rank assumption to exploit the strong correlation between the columns and rows of each matrix and can extract useful features, they were originally built for binary classification problems.
    To improve robustness against data that is rich in outliers, we further extend this problem and present a novel multiclass support matrix machine that maximizes the inter-class margins (i.e. the margins between pairs of classes). We demonstrate the significance and advantage of our methods on different available benchmark datasets for tasks such as person identification, face recognition and EEG classification. Results show that our methods achieve significantly better performance, in terms of both time and accuracy, than state-of-the-art methods on the classification of highly correlated matrix data.

    Efficient Brain Tumor Segmentation with Multiscale Two-Pathway-Group Conventional Neural Networks

    © 2013 IEEE. Manual segmentation of brain tumors from MRI images for cancer diagnosis is a difficult, tedious, and time-consuming task. The accuracy and robustness of brain tumor segmentation are therefore crucial for diagnosis, treatment planning, and treatment outcome evaluation. Most automatic brain tumor segmentation methods use hand-designed features. Similarly, traditional deep learning methods such as convolutional neural networks require a large amount of annotated data to learn from, which is often difficult to obtain in the medical domain. Here, we describe a new two-pathway-group CNN architecture for brain tumor segmentation, which exploits local features and global contextual features simultaneously. This model enforces equivariance in the two-pathway CNN model to reduce instabilities and overfitting through parameter sharing. Finally, we embed the cascade architecture into the two-pathway-group CNN, in which the output of a basic CNN is treated as an additional source and concatenated at the last layer. Validation of the model on the BRATS2013 and BRATS2015 data sets revealed that embedding a group CNN into a two-pathway architecture improved overall performance over the currently published state of the art, while computational complexity remains attractive.

    Big data analytics for preventive medicine

    © 2019, Springer-Verlag London Ltd., part of Springer Nature. Medical data is among the most rewarding and yet most complicated data to analyze. How can healthcare providers use modern data analytics tools and technologies to analyze and create value from complex data? Data analytics promises to efficiently discover valuable patterns by analyzing large amounts of unstructured, heterogeneous, non-standard and incomplete healthcare data. It not only forecasts but also helps in decision making, and it is increasingly seen as a breakthrough in the ongoing effort to improve the quality of patient care and reduce healthcare costs. The aim of this study is to provide a comprehensive and structured overview of extensive research on the advancement of data analytics methods for disease prevention. This review first introduces disease prevention and its challenges, followed by traditional prevention methodologies. We summarize state-of-the-art data analytics algorithms used for classification of disease, clustering (unusually high incidence of a particular disease), anomaly detection (detection of disease) and association, as well as their respective advantages, drawbacks and guidelines for selecting a specific model, followed by a discussion of recent developments and successful applications of disease prevention methods. The article concludes with open research challenges and recommendations.

    A Comprehensive Survey on Word Representation Models: From Classical to State-Of-The-Art Word Representation Language Models

    Word representation has always been an important research area in the history of natural language processing (NLP). Understanding such complex text data is imperative, given that it is rich in information and can be used widely across various applications. In this survey, we explore different word representation models and their power of expression, from the classical to modern-day state-of-the-art (SOTA) word representation language models (LMs). We describe a variety of text representation methods and model designs that have blossomed in the context of NLP, including SOTA LMs. These models can transform large volumes of text into effective vector representations capturing the same semantic information. Further, such representations can be utilized by various machine learning (ML) algorithms for a variety of NLP-related tasks. Finally, this survey briefly discusses the commonly used ML- and deep learning (DL)-based classifiers, evaluation metrics and the applications of these word embeddings in different NLP tasks.

    Progressive Class-Wise Attention (PCA) Approach for Diagnosing Skin Lesions

    Skin cancer holds the highest incidence rate among all cancers globally. The importance of early detection cannot be overstated, as late-stage cases can be lethal. Classifying skin lesions, however, presents several challenges due to the many variations they can exhibit, such as differences in colour, shape, and size, significant variation within the same class, and notable similarities between different classes. This paper introduces a novel class-wise attention technique that regards each class equally while unearthing more specific details about skin lesions. This attention mechanism is used progressively to amalgamate discriminative feature details from multiple scales. The introduced technique demonstrated impressive performance, surpassing more than 15 cutting-edge methods, including the winners of the HAM10000 and ISIC 2019 leaderboards. It achieved an accuracy of 97.40% on the HAM10000 dataset and 94.9% on the ISIC 2019 dataset.

    Sub-sampling Approach for Unconstrained Arabic Scene Text Analysis by Implicit Segmentation based Deep Learning Classifier

    Text extraction from natural scene images is still a cumbersome task. This paper presents a novel contribution and suggests a solution for cursive scene text analysis, notably the recognition of Arabic scene text appearing in unconstrained environments. The hierarchical sub-sampling technique is adapted to investigate its potential by sub-sampling the window size of the given scene text sample. The deep learning architecture is presented with the complexity of the Arabic script in mind. The conducted experiments show 96.81% accuracy at the character level. A comparison of Arabic scene text with handwritten and printed data is outlined as well.

    A Multi-Modal Dataset for Hate Speech Detection on Social Media: Case-study of Russia-Ukraine Conflict

    Hate speech consists of types of content (e.g. text, audio, image) that express derogatory sentiments and hate against certain people or groups of individuals. The internet, particularly social media and microblogging sites, has become an increasingly popular platform for expressing ideas and opinions. Hate speech is prevalent in both offline and online media. A substantial proportion of this kind of content is presented in different modalities (e.g. text, image, video). Taking into account that hate speech spreads quickly during political events, we present a novel multimodal dataset composed of 5680 text-image pairs of tweet data related to the Russia-Ukraine war and annotated with a binary class: "hate" or "no-hate". The baseline results show that multimodal resources are relevant for leveraging the hateful information in different types of data. The baselines and dataset provided in this paper may support research on multimodal hate speech, mainly during serious conflicts such as wars.